Generally, an audio signal encoding apparatus compresses a multi-channel audio signal into a mono or stereo downmix signal instead of compressing each channel of the multi-channel audio signal separately. The audio signal encoding apparatus either transfers the compressed downmix signal to a decoding apparatus together with a spatial information signal (or ancillary data signal), or stores the compressed downmix signal and the spatial information signal in a storage medium.
In this case, the spatial information signal, which is extracted while downmixing the multi-channel audio signal, is used to restore the original multi-channel audio signal from the compressed downmix signal.
The spatial information signal includes a header and spatial information. The header includes configuration information and thus provides the information needed to interpret the spatial information.
An audio signal decoding apparatus decodes the spatial information using the configuration information included in the header. The configuration information included in the header is transferred to the decoding apparatus, or stored in a storage medium, together with the spatial information.
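The dependence of the spatial information on the configuration information can be illustrated with a minimal sketch. The names Config, SpatialFrame and decode_spatial, the field num_bands, and the one-byte-per-band payload layout are hypothetical and chosen only for illustration; they are not taken from any actual bitstream syntax.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Config:
    # Hypothetical configuration field: number of parameter bands per frame.
    num_bands: int


@dataclass(frozen=True)
class SpatialFrame:
    # Raw spatial-information payload; it cannot be interpreted without a Config.
    payload: bytes


def decode_spatial(frame: SpatialFrame, config: Optional[Config]) -> List[int]:
    """Interpret the payload as one integer parameter per band (illustrative layout)."""
    if config is None:
        # Without the header's configuration information, decoding is not possible.
        raise ValueError("configuration information is required")
    if len(frame.payload) != config.num_bands:
        raise ValueError("payload length does not match the configuration")
    return list(frame.payload)
```

In this sketch, calling decode_spatial with config set to None raises an error, which mirrors a decoding apparatus that has received spatial information but not the header.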
An audio signal encoding apparatus multiplexes an encoded downmix signal and the spatial information signal into a bitstream and then transfers the multiplexed signal to a decoding apparatus. Since configuration information generally does not vary, a header including the configuration information is inserted in the bitstream only once. Because the configuration information is transmitted only once, at the beginning of the audio signal, an audio signal decoding apparatus has a problem decoding the spatial information when the audio signal is reproduced from a random timing point, since the configuration information is not available at that point. Namely, in the case of a broadcast, VOD (video on demand) or the like, the audio signal is reproduced from a specific timing point requested by a user rather than from its initial part, so the configuration information carried at the start of the audio signal cannot be used. As a result, it may be impossible to decode the spatial information.
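The random-access problem described above can be sketched as follows. The bitstream is modelled, purely as an assumption for illustration, as a list of (kind, value) units in which a single header unit precedes the spatial frames; build_stream and play_from are hypothetical helper names.

```python
from typing import List, Optional, Tuple

# Illustrative model only: each unit is ("header", config) or ("frame", index).
Unit = Tuple[str, int]


def build_stream(num_frames: int, config: int) -> List[Unit]:
    """Insert the header once, at the start, followed by the spatial frames."""
    return [("header", config)] + [("frame", i) for i in range(num_frames)]


def play_from(stream: List[Unit], start: int) -> List[Optional[Tuple[int, int]]]:
    """Decode from an arbitrary start index; None marks an undecodable frame."""
    config: Optional[int] = None
    decoded: List[Optional[Tuple[int, int]]] = []
    for kind, value in stream[start:]:
        if kind == "header":
            config = value  # configuration becomes available from here on
        elif config is None:
            decoded.append(None)  # frame seen before any header: cannot decode
        else:
            decoded.append((config, value))
    return decoded
```

Under this model, reproduction from the initial part of the stream decodes every frame, whereas reproduction from any later point never encounters the header, so every frame from that point on is undecodable.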